 degree heterogeneity


Departure from Regularity: Degree Heterogeneity and Eigengap as the Structural Drivers of ASE-LSE Latent Subspace Disagreement

arXiv.org Machine Learning

Two of the most widely used methods for analysing graph data, Adjacency Spectral Embedding and Laplacian Spectral Embedding, often produce different results when applied to the same network. Yet the structural reasons behind this disagreement remain incompletely understood. This paper provides a structural account. We show that regularity is a sufficient condition for perfect agreement: when every node has the same number of connections, the two methods produce identical latent subspaces. Any departure from this regularity introduces disagreement, and we prove an explicit bound whose two terms suggest the structural ingredients controlling it: degree heterogeneity, which pushes the methods apart, and community structure strength, which pulls them back together. We validate both drivers empirically across thousands of simulated networks, confirming that heterogeneity drives disagreement up, community strength suppresses it, and their ratio provides a strong predictor of when the two embeddings can be treated as interchangeable and when they cannot.
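The regularity claim can be checked numerically on small graphs. The sketch below is illustrative only and rests on assumptions that the abstract does not state: LSE is taken to use the normalized adjacency D^{-1/2} A D^{-1/2}, each embedding subspace is spanned by the k eigenvectors of largest absolute eigenvalue, and disagreement is measured as the Frobenius distance between the two projection matrices. The function names are hypothetical.

```python
import numpy as np

def top_k_projector(M, k):
    """Projector onto the span of the k eigenvectors of largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(-np.abs(vals))[:k]
    U = vecs[:, idx]
    return U @ U.T

def ase_lse_disagreement(A, k):
    """Frobenius distance between ASE and LSE subspace projectors (assumed metric)."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Assumed LSE matrix: normalized adjacency D^{-1/2} A D^{-1/2}
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.linalg.norm(top_k_projector(A, k) - top_k_projector(L, k))

# Regular example: complete graph on 4 nodes (every degree is 3)
K4 = np.ones((4, 4)) - np.eye(4)

# Irregular example: triangle 0-1-2 with a pendant node 3 attached to node 0
paw = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)

d_regular = ase_lse_disagreement(K4, k=1)
d_irregular = ase_lse_disagreement(paw, k=1)
print(d_regular, d_irregular)
```

For a d-regular graph the normalized adjacency equals A/d, so both matrices share eigenvectors and the distance vanishes up to floating-point error. The pendant node breaks regularity and the distance becomes strictly positive.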


Sharp Impossibility Results for Hypergraph Testing

Neural Information Processing Systems

In a broad Degree-Corrected Mixed-Membership (DCMM) setting, we test whether a non-uniform hypergraph has only one community or has multiple communities. Since both the null and alternative hypotheses have many unknown parameters, the challenge is, given an alternative, how to identify the null that is hardest to separate from the alternative. We approach this by proposing a degree matching strategy where the main idea is leveraging the theory for tensor scaling to create a least favorable pair of hypotheses. We present a result on standard minimax lower bound theory and a result on Region of Impossibility (which is more informative than the minimax lower bound). We show that our lower bounds are tight by introducing a new test that attains the lower bound up to a logarithmic factor. We also discuss the case where the hypergraphs may have mixed-memberships.


Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data

Neural Information Processing Systems

We propose a novel class of network models for temporal dyadic interaction data. Our objective is to capture important features often observed in social interactions: sparsity, degree heterogeneity, community structure and reciprocity. We use mutually-exciting Hawkes processes to model the interactions between each (directed) pair of individuals. The intensity of each process allows interactions to arise as responses to opposite interactions (reciprocity), or due to shared interests between individuals (community structure). For sparsity and degree heterogeneity, we build the time-independent part of the intensity function on compound random measures, following Todeschini et al. (2016). We conduct experiments on real-world temporal interaction data and show that the proposed model outperforms competing approaches for link prediction and leads to interpretable parameters.
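To make the reciprocity mechanism concrete, here is a minimal sketch of a mutually-exciting intensity for one directed pair. It assumes an exponential excitation kernel, which the abstract does not specify. The names `reciprocal_intensity`, `base_rate`, `jump` and `decay` are hypothetical, and `base_rate` merely stands in for the time-independent part that the paper builds from compound random measures.

```python
import math

def reciprocal_intensity(t, base_rate, reverse_event_times, jump=0.5, decay=1.0):
    """Intensity of i -> j at time t, excited by past j -> i events.

    Assumption: each past reverse interaction adds jump * exp(-decay * elapsed).
    """
    excitation = sum(
        jump * math.exp(-decay * (t - s))
        for s in reverse_event_times
        if s < t
    )
    return base_rate + excitation

# With no past replies the intensity equals the base rate;
# a reply at time 0 raises the intensity observed at time 1.
quiet = reciprocal_intensity(1.0, 0.2, [])
excited = reciprocal_intensity(1.0, 0.2, [0.0])
print(quiet, excited)
```

Under this assumed kernel, the effect of a reply decays exponentially, so recent opposite interactions matter more than old ones.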




DCMM-Transformer: Degree-Corrected Mixed-Membership Attention for Medical Imaging

arXiv.org Artificial Intelligence

Medical images exhibit latent anatomical groupings, such as organs, tissues, and pathological regions, that standard Vision Transformers (ViTs) fail to exploit. While recent work such as SBM-Transformer attempts to incorporate such structures through stochastic binary masking, these approaches suffer from non-differentiability, training instability, and the inability to model complex community structure. We present DCMM-Transformer, a novel ViT architecture for medical image analysis that incorporates a Degree-Corrected Mixed-Membership (DCMM) model as an additive bias in self-attention. Unlike prior approaches that rely on multiplicative masking and binary sampling, our method introduces community structure and degree heterogeneity in a fully differentiable and interpretable manner. Comprehensive experiments across diverse medical imaging datasets, including brain, chest, breast, and ocular modalities, demonstrate the superior performance and generalizability of the proposed approach. Furthermore, the learned group structure and structured attention modulation substantially enhance interpretability by yielding attention maps that are anatomically meaningful and semantically coherent.
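One way an additive DCMM bias could enter self-attention is sketched below. This is not the paper's implementation: the abstract does not give the bias formula, so the sketch assumes the bias is the log of the DCMM affinity theta_i * theta_j * pi_i^T B pi_j, added to the scaled dot-product scores before the softmax. All function and parameter names are hypothetical.

```python
import numpy as np

def dcmm_bias(theta, pi, B, eps=1e-9):
    """Assumed bias: log of the DCMM affinity between every pair of tokens.

    theta: (n,) positive degree parameters
    pi:    (n, K) mixed-membership weights, rows sum to 1
    B:     (K, K) community affinity matrix
    """
    affinity = (theta[:, None] * theta[None, :]) * (pi @ B @ pi.T)
    return np.log(affinity + eps)

def attention_with_bias(Q, K, V, bias):
    """Scaled dot-product attention with an additive bias on the scores."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1]) + bias
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

theta = np.array([1.0, 0.5, 2.0])
pi = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
B = np.array([[1.0, 0.1], [0.1, 1.0]])
bias = dcmm_bias(theta, pi, B)

# With zero queries and keys, the weights reduce to the row-normalized affinity.
Q = np.zeros((3, 4))
V = np.eye(3)
out, weights = attention_with_bias(Q, Q, V, bias)
print(weights)
```

Because the bias is added rather than used as a sampled binary mask, every operation here is differentiable in theta, pi and B, which is the property the abstract emphasizes.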


Review on Determining the Number of Communities in Network Data

arXiv.org Machine Learning

This paper reviews statistical methods for hypothesis testing and clustering in network models. We analyze the method by Bickel et al. (2016) for deriving the asymptotic null distribution of the largest eigenvalue, noting its slow convergence and the need for bootstrap corrections. The SCORE method by Jin et al. (2015) and the NCV method by Chen et al. (2018) are evaluated for their efficacy in clustering within Degree-Corrected Block Models, with NCV facing challenges due to its time-intensive nature. We suggest exploring eigenvector entry distributions as a potential efficiency improvement.
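The core step of SCORE, as commonly described, is to divide the non-leading eigenvectors entrywise by the leading eigenvector, which cancels the degree parameters. The sketch below illustrates only that ratio step on a noiseless degree-corrected block model matrix. The example parameters and the name `score_ratios` are hypothetical, and the final clustering step (k-means in the original method) is omitted.

```python
import numpy as np

def score_ratios(A, K):
    """Entrywise ratios of eigenvectors 2..K to the leading eigenvector."""
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(-np.abs(vals))[:K]  # K largest eigenvalues in magnitude
    lead = vecs[:, order[0]]
    return vecs[:, order[1:]] / lead[:, None]

# Noiseless two-community example with heterogeneous degree parameters
theta = np.array([1.0, 0.5, 2.0, 0.8, 1.5, 0.3])
labels = np.array([0, 0, 0, 1, 1, 1])
B = np.array([[1.0, 0.2], [0.2, 1.0]])
Omega = np.outer(theta, theta) * B[np.ix_(labels, labels)]

ratios = score_ratios(Omega, K=2)[:, 0]
print(ratios)
```

In this noiseless setting the ratios are constant within each community and take opposite signs across the two communities, regardless of the degree parameters. On an observed adjacency matrix the ratios are only approximately constant, which is why a clustering step follows.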


Heterogeneous Update Processes Shape Information Cascades in Social Networks

arXiv.org Artificial Intelligence

A common assumption in the literature on information diffusion is that populations are homogeneous regarding individuals' information acquisition and propagation process: individuals update their informed and actively communicating state either through imitation (simple contagion) or peer influence (complex contagion). Here, we study the impact of the mixing and placement of individuals with different update processes on how information cascades in social networks. We consider Simple Spreaders, which take information from a random neighbor and communicate it, and Threshold-based Spreaders, which require a threshold number of active neighbors to change their state to active communication. Even though, in a population made exclusively of Simple Spreaders, information reaches all elements of any (connected) network, we show that, when Simple and Threshold-based Spreaders coexist and occupy random positions in a social network, the number of Simple Spreaders systematically amplifies the cascades only in degree-heterogeneous networks (exponential and scale-free). In random and modular structures, this cascading effect driven by Simple Spreaders only exists above a critical mass of these individuals. In contrast, when Threshold-based Spreaders are preferentially placed on higher-degree nodes, the cascading effect of Simple Spreaders vanishes, and the spread of information is drastically impaired. Overall, the study highlights the significance of the strategic placement of different roles in networked structures, with Simple Spreaders driving widespread cascades in heterogeneous networks and Threshold-based Spreaders playing a critical regulatory role in information spread with a tunable effect based on the threshold value. These effects have consequences for our understanding of social phenomena, such as the spread of innovations in heterogeneous social systems with the presence of eager (Simple Spreaders) versus averse (Threshold-based Spreaders) adopters, but also for information warfare on social media, where Simple Spreaders can be seen as embedded agents (e.g., bots) used to amplify the virality of ill-intended content and, conversely, Threshold-based Spreaders as an essential self-regulatory element of social systems operating as information filters.
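The two update rules can be sketched as a single synchronous step. This is a simplified illustration, not the authors' simulation: it assumes that active nodes stay active and that all nodes update at once, neither of which the abstract states. The names `update_node` and `step` are hypothetical.

```python
import random

def update_node(node, neighbors, active, kind, threshold, rng):
    """Next state of one node under its own update rule."""
    if active[node]:
        return True  # assumption: activation is never lost
    nbrs = neighbors[node]
    if not nbrs:
        return False
    if kind[node] == "simple":
        # Simple Spreader: copy the state of one random neighbor
        return active[rng.choice(nbrs)]
    # Threshold-based Spreader: needs enough active neighbors
    return sum(active[v] for v in nbrs) >= threshold

def step(neighbors, active, kind, threshold, rng):
    """Synchronous update of every node (an assumed schedule)."""
    return [
        update_node(i, neighbors, active, kind, threshold, rng)
        for i in range(len(active))
    ]

# Star graph: node 0 is the active hub, nodes 1..3 are inactive leaves
neighbors = [[1, 2, 3], [0], [0], [0]]
active = [True, False, False, False]
rng = random.Random(0)

all_simple = step(neighbors, active, ["simple"] * 4, threshold=2, rng=rng)
all_threshold = step(neighbors, active, ["threshold"] * 4, threshold=2, rng=rng)
print(all_simple, all_threshold)
```

On this star, every Simple Spreader leaf copies the active hub in one step, whereas Threshold-based leaves with a threshold of 2 never activate because each has only one neighbor.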